CWE-230 Improper handling of missing values #947

Open
wants to merge 6 commits into
base: main
from

Conversation

dwiley258
Contributor

No description provided.

Contributor

@myteron myteron left a comment


few cosmetics, +1 otherwise.


## Non-Compliant Code Example

This noncompliant code example [[2024 docs.python.org]](https://docs.python.org/3/reference/expressions.html#value-comparisons) attempts a direct comparison with `NaN` in `_value == float("NaN")`.

Suggested change
This noncompliant code example [[2024 docs.python.org]](https://docs.python.org/3/reference/expressions.html#value-comparisons) attempts a direct comparison with `NaN` in `_value == float("NaN")`.
The `noncompliant01.py` code example [[2024 docs.python.org]](https://docs.python.org/3/reference/expressions.html#value-comparisons) attempts a direct comparison with `NaN` in `_value == float("NaN")`.
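To illustrate why the direct comparison fails, here is a minimal sketch. Only the expression `_value == float("NaN")` and the name `balance_is_positive` come from this PR; the rest of the function body is an assumption for illustration and may differ from the actual `noncompliant01.py`.

```python
def balance_is_positive(value: str) -> bool:
    """Hypothetical non-compliant check: the NaN guard can never fire."""
    _value = float(value)
    # NaN is not equal to anything, including itself, so this
    # comparison is always False and the guard is dead code.
    if _value == float("NaN") or _value is None:
        raise ValueError("Expected a float")
    # Ordered comparisons with NaN are also False, so NaN falls through.
    if _value <= 0:
        return False
    return True


print(balance_is_positive("NaN"))  # prints True: NaN is reported as a positive balance
```

In this sketch a `NaN` input is neither rejected nor treated as non-positive, which is the surprising behaviour the rule warns about.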


## Compliant Solution

The `compliant01.py` the method Decimal.quantize is used to gain control over known rounding errors in floating point values.

Suggested change
The `compliant01.py` the method Decimal.quantize is used to gain control over known rounding errors in floating point values.
The `compliant01.py` the method `Decimal.quantize` is used to gain control over known rounding errors in floating point values.


The sentence also needs a small rewording. Maybe something like this?

Suggested change
The `compliant01.py` the method Decimal.quantize is used to gain control over known rounding errors in floating point values.
In the `compliant01.py` code example, the method `Decimal.quantize` is used to gain control over known rounding errors in floating point values.

Contributor

@s19110 s19110 left a comment


Finished the review. The code looks good overall; I had only one small comment on it, but there were also some small problems in the README.

@@ -0,0 +1,148 @@
# CWE-230: Improper Handling of Missing Values

In python, some datasets use `NaN` (not-a-number) to represent the missing data. This can be problematic as the `NaN` values are unordered. The `NaN` value should be stripped before as they can cause surprising or undefined behaviours in the statistics functions that sort or count occurrences [[2024 doc.python.org]](https://docs.python.org/3/library/statistics.html) Any ordered comparison of a number to a not-a-number value are `False`. A counter-intuitive implication is that `not-a-number` values are not equal to themselves.

Since we try to have one summary sentence in the first paragraph for the search engines, perhaps something like this could work?

Suggested change
In python, some datasets use `NaN` (not-a-number) to represent the missing data. This can be problematic as the `NaN` values are unordered. The `NaN` value should be stripped before as they can cause surprising or undefined behaviours in the statistics functions that sort or count occurrences [[2024 doc.python.org]](https://docs.python.org/3/library/statistics.html) Any ordered comparison of a number to a not-a-number value are `False`. A counter-intuitive implication is that `not-a-number` values are not equal to themselves.
`NaN` values should be stripped before data is passed to the statistics functions that sort or count occurrences, as they can cause surprising or undefined behaviours there [[2024 docs.python.org]](https://docs.python.org/3/library/statistics.html).
In Python, some datasets use `NaN` (not-a-number) to represent missing data. This can be problematic because `NaN` values are unordered. Any ordered comparison of a number to a not-a-number value is `False`. A counter-intuitive implication is that not-a-number values are not equal to themselves.
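A minimal sketch of stripping `NaN` values before calling a statistics function. The sample data and variable names are hypothetical and not taken from the PR.

```python
import math
import statistics

# Hypothetical dataset in which NaN marks a missing reading.
readings = [2.5, float("nan"), 1.0, 4.0]

# Filter with math.isnan; an equality test such as `x == float("nan")`
# would never match because NaN is not equal to itself.
cleaned = [x for x in readings if not math.isnan(x)]

# With the NaN removed, sorting-based functions behave predictably.
median_value = statistics.median(cleaned)
print(cleaned, median_value)
```

Filtering up front keeps the decision about missing data explicit instead of leaving it to the comparison semantics of `NaN`.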


The `compliant01.py` the method Decimal.quantize is used to gain control over known rounding errors in floating point values.

The decision by the balance_is_positive method is to `ROUND_DOWN` instead of the default `ROUND_HALF_EVEN`.

Suggested change
The decision by the balance_is_positive method is to `ROUND_DOWN` instead of the default `ROUND_HALF_EVEN`.
The decision by the `balance_is_positive` method is to `ROUND_DOWN` instead of the default `ROUND_HALF_EVEN`.


`Decimal` raises a `decimal.InvalidOperation` for `NaN` values, and the controlled rounding causes only `"0.01"` to return `True`.
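The behaviour described here can be sketched as follows. This is an assumption about how such a check might be written, not the actual `compliant01.py`; only `Decimal.quantize`, `ROUND_DOWN`, and the name `balance_is_positive` come from the PR text.

```python
from decimal import ROUND_DOWN, Decimal, InvalidOperation


def balance_is_positive(value: str) -> bool:
    """Hypothetical check that rounds down to whole cents before comparing."""
    # ROUND_DOWN truncates towards zero instead of the default ROUND_HALF_EVEN.
    _value = Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_DOWN)
    # With the default context, an ordered comparison involving a
    # Decimal NaN signals decimal.InvalidOperation.
    if _value <= Decimal("0"):
        return False
    return True


for candidate in ("0.01", "0.001", "NaN"):
    try:
        print(candidate, balance_is_positive(candidate))
    except InvalidOperation:
        print(candidate, "rejected with InvalidOperation")
```

In this sketch `"0.001"` is truncated to `0.00` and therefore is not positive, while `"NaN"` is rejected with an exception rather than silently passing.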

In `compliant02.py` we use the math.isnan to very if the value passed is a valid `float` value.

Suggested change
In `compliant02.py` we use the math.isnan to very if the value passed is a valid `float` value.
In `compliant02.py` we use the `math.isnan` to verify if the value passed is a valid `float` value.

This behavior is compliant with IEEE 754 [[2024 Wikipedia]](https://en.wikipedia.org/wiki/IEEE_754), a hardware-induced compromise.
The [example01.py](example01.py) code demonstrates various comparisons of `float('NaN')` all resulting in `False`.
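A short sketch of the kind of comparisons meant here. It is not the contents of `example01.py`; the dictionary and its labels are hypothetical.

```python
nan = float("NaN")

# Equality and every ordered comparison involving NaN evaluate to False.
comparisons = {
    "nan == nan": nan == nan,
    "nan == 0.0": nan == 0.0,
    "nan < 1.0": nan < 1.0,
    "nan > 1.0": nan > 1.0,
    "nan <= nan": nan <= nan,
    "nan >= nan": nan >= nan,
}

for label, result in comparisons.items():
    print(f"{label}: {result}")
```

Every line printed by this sketch ends in `False`, including the comparison of `NaN` with itself.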

```python

Suggested change
```python
```python
# SPDX-FileCopyrightText: OpenSSF project contributors
# SPDX-License-Identifier: MIT

*[noncompliant01.py](noncompliant01.py):*

```python
""" Non-compliant Code Example """

Suggested change
""" Non-compliant Code Example """
# SPDX-FileCopyrightText: OpenSSF project contributors
# SPDX-License-Identifier: MIT
""" Non-compliant Code Example """

_value = float(value)
if math.isnan(_value) or _value is None:
    raise ValueError("Expected a float")
if _value <= 0:

This might be slightly confusing because `print(balance_is_positive("0.001"))` returns `False` in `compliant01.py` but returns `True` here. If we assume the balance uses cents, we could just change the threshold in this condition:

Suggested change
if _value <= 0:
if _value < 0.01:

If we do so, the same should be done in `noncompliant01.py` so that this part of the code remains unchanged.
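With the proposed threshold, the check could look like the following sketch. The function wrapper is an assumption based on the fragment quoted above; the `_value is None` test is omitted here because `float(None)` already raises `TypeError` before that line could run.

```python
import math


def balance_is_positive(value: str) -> bool:
    """Hypothetical variant of the check using the 0.01 threshold."""
    _value = float(value)
    # math.isnan is the reliable way to detect NaN; equality tests cannot.
    if math.isnan(_value):
        raise ValueError("Expected a float")
    # Treat anything below one cent as not positive.
    if _value < 0.01:
        return False
    return True


for candidate in ("0.01", "0.001", "NaN"):
    try:
        print(candidate, balance_is_positive(candidate))
    except ValueError as error:
        print(candidate, "rejected:", error)
```

Under this assumption `"0.001"` returns `False`, which matches the `Decimal`-based behaviour described for `compliant01.py`.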

|Tool|Version|Checker|Description|
|:----|:----|:----|:----|
|Bandit|1.7.4 on Python 3.10.4|Not Available||
|flake8|flake8-4.0.1 on python 3.10.4||FS002 '.format' used|

I ran flake8 on my machine and no error was shown. I don't see `.format` used in any code example, so I assume it's a leftover from an old version of the rule.

Suggested change
|flake8|flake8-4.0.1 on python 3.10.4||FS002 '.format' used|
|flake8|flake8-4.0.1 on python 3.10.4|Not Available||


|||
|:---|:---|
|[SEI CERT Coding Standard for Java](https://wiki.sei.cmu.edu/confluence/display/java/SEI+CERT+Oracle+Coding+Standard+for+Java)|[IDS06-J. Exclude unsanitized user input from format strings](https://wiki.sei.cmu.edu/confluence/display/java/IDS06-J.+Exclude+unsanitized+user+input+from+format+strings)|

This SEI CERT rule seems unrelated. I have found another one that talks specifically about NaN values:

Suggested change
|[SEI CERT Coding Standard for Java](https://wiki.sei.cmu.edu/confluence/display/java/SEI+CERT+Oracle+Coding+Standard+for+Java)|[IDS06-J. Exclude unsanitized user input from format strings](https://wiki.sei.cmu.edu/confluence/display/java/IDS06-J.+Exclude+unsanitized+user+input+from+format+strings)|
|[SEI CERT Coding Standard for Java](https://wiki.sei.cmu.edu/confluence/display/java/SEI+CERT+Oracle+Coding+Standard+for+Java)|[NUM07-J. Do not attempt comparisons with NaN](https://wiki.sei.cmu.edu/confluence/display/java/NUM07-J.+Do+not+attempt+comparisons+with+NaN)|

hedrok and others added 6 commits July 31, 2025 11:35
Before this commit the wording was that modifying list works but
is not recommended.

But it works as long as no two consecutive elements are deleted,
otherwise part of elements is not checked at all without any
exceptions raised.

Changed README.md, compliant01.py and noncompliant01.py to
demonstrate that.

Signed-off-by: Kyrylo Yatsenko <[email protected]>
Signed-off-by: Helge Wehder <[email protected]>
Signed-off-by: ewlxdnx <[email protected]>
…f#936)

* guide
Signed-off-by: balteravishay <[email protected]>

* guide

Signed-off-by: balteravishay <[email protected]>

* remove temp files
Signed-off-by: balteravishay <[email protected]>

* lint

Signed-off-by: balteravishay <[email protected]>

* Update docs/Security-Focused-Guide-for-AI-Code-Assistant-Instructions.md

Signed-off-by: Avishay Balter <[email protected]>

* Update docs/Security-Focused-Guide-for-AI-Code-Assistant-Instructions.md

Signed-off-by: Avishay Balter <[email protected]>

---------

Signed-off-by: balteravishay <[email protected]>
Signed-off-by: Avishay Balter <[email protected]>
Signed-off-by: ewlxdnx <[email protected]>
Signed-off-by: Helge Wehder <[email protected]>
Signed-off-by: ewlxdnx <[email protected]>
5 participants